Yifan Yang

From Word to Sentence: A Large-Scale Multi-Instance Dataset for Open-Set Aerial Detection

May 06, 2025
Via arXiv

Empowering Agentic Video Analytics Systems with Video Language Models

May 02, 2025
Via arXiv

Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in Large Language Models

May 01, 2025
Via arXiv

Zoomer: Adaptive Image Focus Optimization for Black-box MLLM

Apr 30, 2025
Via arXiv

The Fourth Monocular Depth Estimation Challenge

Apr 24, 2025
Via arXiv

EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting

Apr 22, 2025
Via arXiv

Pseudo-Autoregressive Neural Codec Language Models for Efficient Zero-Shot Text-to-Speech Synthesis

Apr 14, 2025
Via arXiv

Hyperlocal disaster damage assessment using bi-temporal street-view imagery and pre-trained vision models

Apr 12, 2025
Via arXiv

Adaptive Bounded Exploration and Intermediate Actions for Data Debiasing

Apr 10, 2025
Via arXiv

Multi-Mission Tool Bench: Assessing the Robustness of LLM based Agents through Related and Dynamic Missions

Apr 03, 2025
Via arXiv